Abstract



We present a new approach to the bootstrap for chains of infinite order 
taking values on a finite alphabet. It is based on a sequential Bootstrap 
Central Limit Theorem for the sequence of canonical Markov approximations 
of the chain of infinite order. Combined with previous results on the rate of 
approximation this leads to a Central Limit Theorem for the bootstrapped 
estimator of the sample mean which is the main result of this paper. 

1 Introduction. 

In this paper we introduce a new procedure of bootstrap resampling for chains 
on a finite alphabet whose transition probabilities depend on the whole past. 
This resampling uses the excursions of the chain between successive occur- 
rences of the initial string of k symbols as building blocks for the bootstrap 
sample. The bootstrap sample is obtained by concatenating randomly cho- 
sen blocks. These blocks are chosen uniformly and independently among the 
first $m_k$ excursion blocks. For chains which lose memory exponentially fast
we prove a Central Limit Theorem for the empirical mean of the bootstrap 
sample, when the length k of the initial reference string as well as the number 
of excursion blocks diverge with a suitable relation between them. This 
is the main result of the article. 

The idea behind our procedure is that a typical large sample of the chain 
of infinite order behaves essentially as a sample of a Markov chain of or- 
der k suitably chosen. The Markov property of the approximating chain 
implies that the successive excursion blocks are independent and identically 
distributed. This makes it possible to construct the bootstrap sample by 
simply concatenating randomly chosen blocks, exactly as proposed in the 
original paper by Efron (1979) for the case of i.i.d. random variables. 

This idea has already been exploited in the case of Markov chains in 
Athreya and Fuh (1992). For chains of infinite order with different types of 
mixing conditions, different approaches to the bootstrap have been proposed 
in the papers by Carlstein (1986) and Künsch (1989) and thoroughly studied
in the recent literature, see for example Liu and Singh (1992), Shao and Yu
(1993), Naik-Nimbalkar and Rajarshi (1994), Bühlmann (1994) and Peligrad
(1998).

Chains of infinite order seem to have been first studied by Onicescu and 
Mihoc (1935a), who called them chaînes à liaisons complètes. Their study
was soon taken up by Doeblin and Fortet (1937), who proved the first results
on speed of convergence towards the invariant measure. The name chains of
infinite order was coined by Harris (1955). Our proof is based on the upper
bound on the rate of approximation of the chain of infinite order by the
sequence of canonical Markov approximations presented in Fernandez and
Galves (2002). We also use the φ-mixing property of the chain of infinite
order proven in Bressaud, Fernandez and Galves (1999). We refer the reader
to Iosifescu and Grigorescu (1990) for a complete survey, and to Fernandez,
Ferrari and Galves (2001) for an elementary presentation of the subject from
a constructive point of view.

The rest of the paper is organized as follows. In Section 2 we introduce
the notation and the definitions and state the main results. In Section 3 we
collect together a few technical results which will be used in the proof of the
theorems. In Section 4 we prove a central limit theorem for the sequence of
canonical approximating Markov chains. Finally in Section 5 we prove the
main result, which is a bootstrap central limit theorem for the empirical mean
of a chain of infinite order.

2 Notations, definitions and statement of the 
main result. 

Let $(X_n)_{n\in\mathbb{Z}}$ be a stationary process taking values on a finite alphabet $A$. We
will use the shorthand notation

\[ p(x_0 \mid x_{-1}, x_{-2}, \ldots) = \mathbb{P}(X_0 = x_0 \mid X_{-1} = x_{-1}, X_{-2} = x_{-2}, \ldots) \]

to denote the regular version of the conditional probability of the process.
To avoid long formulas, whenever convenient, we will use the notation $a_{0,l}$ to
denote the sequence $(a_0, \ldots, a_l)$ of elements of $A$. We also use the notation
$\{X_{n,n+l} = a_{0,l}\}$ to denote the cylinder set

\[ \{X_n = a_0, \ldots, X_{n+l} = a_l\}. \]

Following Harris (1955), we call this process a chain of infinite order.
We assume that $(X_n)_{n\in\mathbb{Z}}$ satisfies the following hypotheses.

H_1

\[ \delta = \min_{a\in A}\ \inf_{(\ldots,x_{-2},x_{-1})\in A_a} p(a \mid x_{-1}, x_{-2}, \ldots) > 0, \tag{2.1} \]

where $A_a = \{(\ldots, x_{-2}, x_{-1}) : p(a \mid x_{-1}, x_{-2}, \ldots) > 0\}$.

H_2

\[ c = -\limsup_{l\to+\infty} \frac{1}{l}\log\beta_l > 0, \]

where

\[ \beta_l = \sup_{x_i = y_i,\ i=-l,\ldots,0} \left| p(x_0 \mid x_{-1}, x_{-2}, \ldots) - p(y_0 \mid y_{-1}, y_{-2}, \ldots) \right|. \]

Let $f : A^r \to \mathbb{R}$ be a real observable of the chain, where $r$ is a fixed
positive integer, and denote

\[ \mu = \mathbb{E}(f(X_1, \ldots, X_r)), \]

the average value of the observable $f$. We are interested in the fluctuations
of an estimator of $\mu$. To simplify the presentation we can assume without
loss of generality that $r = 1$, namely the cylinder function $f$ through which
we observe the chain depends only on one coordinate.

To avoid uninteresting pathologies we will assume that the following third
hypothesis holds.

H_3

\[ \sigma^2 = \mathrm{Var}(f(X_0)) + 2\sum_{j=1}^{+\infty} \mathrm{Cov}(f(X_0), f(X_j)) > 0. \]

We recall that hypotheses H_1 and H_2 imply that the chain $(X_n)_{n\in\mathbb{Z}}$ is
exponentially φ-mixing (cf. Bressaud, Fernandez and Galves (1999)). This
last property implies that the series defining $\sigma^2$ is convergent (cf. for instance
Theorem 19.1 in Billingsley 1999). However it is well known that this does
not imply that $\sigma$ is strictly positive.

Our bootstrap procedure is defined as follows. For any positive integer
$k$, the sequence $(R_i(k))_{i\in\mathbb{N}}$ of return times of the first string of length $k$ is
defined by

\[ R_{i+1}(k) = \inf\{n > R_i(k) : (X_n, \ldots, X_{n+k-1}) = (X_1, \ldots, X_k)\} \]

with $R_0(k) = 1$.

Let $\xi_i(k)$ be the block of values of the chain from $R_{i-1}(k)$ up to $R_i(k) - 1$,
namely

\[ \xi_i(k) = (X_{R_{i-1}(k)}, \ldots, X_{R_i(k)-1}). \tag{2.2} \]
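
The return times and excursion blocks defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample is a finite list, positions are 1-based as in the text, and the function names `return_times` and `excursion_blocks` are hypothetical, not part of the paper.

```python
def return_times(x, k):
    """Return [R_0, R_1, ...] in 1-based positions.

    R_0 = 1, and each later entry is the next position n at which
    (x_n, ..., x_{n+k-1}) equals the initial string (x_1, ..., x_k).
    """
    ref = tuple(x[:k])
    times = [1]
    for n in range(2, len(x) - k + 2):  # 1-based start positions
        if tuple(x[n - 1:n - 1 + k]) == ref:
            times.append(n)
    return times


def excursion_blocks(x, k):
    """Block i holds the values from position R_{i-1} up to R_i - 1."""
    r = return_times(x, k)
    return [x[r[i - 1] - 1:r[i] - 1] for i in range(1, len(r))]
```

For instance, with the sample `[0, 1, 1, 0, 1, 0, 0, 1, 1]` and `k = 2`, the reference string is `(0, 1)`, which occurs at positions 1, 4 and 7, giving two complete excursion blocks of length 3.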
We will make a uniform i.i.d. selection of the first $m$ blocks $\xi_1(k), \ldots, \xi_m(k)$
to construct a bootstrap sample of the chain. We will take $m = m_k$ as
a diverging function of $k$ to be fixed later. This leads naturally to the
construction of a sequence of bootstrap samples indexed by $k$.

The formal definition is the following. For every $k$, let $I_1(k), \ldots, I_{m_k}(k)$
be $m_k$ independent random variables with uniform distribution in the set
$\{1, \ldots, m_k\}$. The bootstrap blocks are defined as

\[ \xi_l^*(k) = \xi_{I_l(k)}(k), \]

for $l = 1, \ldots, m_k$. The bootstrap sample $X_1^*(k), \ldots, X^*_{R^*_{m_k}(k)}(k)$ is constructed
by concatenating the bootstrap blocks $\xi_1^*(k), \ldots, \xi_{m_k}^*(k)$. We observe that
the return times of the bootstrap sample assume the values $R_0^*(k) = 1$ and,
for $l = 1, \ldots, m_k$,

\[ R_l^*(k) = R_{l-1}^*(k) + R_{I_l(k)+1}(k) - R_{I_l(k)}(k). \]
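We consider the following sequence of estimators for $\mu$

\[ \hat\mu_k = \frac{1}{R_{m_k}(k) - 1} \sum_{n=1}^{R_{m_k}(k)-1} f(X_n). \tag{2.3} \]

Its bootstrap counterpart is given by

\[ \hat\mu_k^* = \frac{1}{R^*_{m_k}(k) - 1} \sum_{n=1}^{R^*_{m_k}(k)-1} f(X_n^*(k)). \tag{2.4} \]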
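
As a rough illustration of the resampling step and of the two estimators, the following Python sketch draws block indices uniformly with replacement, concatenates the chosen blocks, and averages the observable. It assumes the excursion blocks are already available as a list of lists; the names `bootstrap_sample` and `mean_estimator` are hypothetical, and the random generator is passed in explicitly so that runs can be reproduced.

```python
import random


def bootstrap_sample(blocks, m, rng):
    """Concatenate m blocks drawn uniformly, with replacement, from the first m."""
    chosen = [blocks[rng.randrange(m)] for _ in range(m)]
    return [symbol for block in chosen for symbol in block]


def mean_estimator(sample, f):
    """Empirical mean of the observable f over a (concatenated) sample."""
    return sum(f(s) for s in sample) / len(sample)


if __name__ == "__main__":
    blocks = [[0, 1, 1], [0, 1, 0]]
    original = [s for b in blocks for s in b]
    star = bootstrap_sample(blocks, 2, random.Random(0))
    print(mean_estimator(original, float), mean_estimator(star, float))
```

The concatenation of the first $m_k$ original blocks has length $R_{m_k}(k) - 1$, so `mean_estimator` applied to it corresponds to (2.3), and applied to the resampled concatenation it corresponds to (2.4).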



Let 



Ou = 



Yar 






R* mk (k)-l 



where Var denotes the variance. Observe that $\sigma_k^2$ is a function of the sample
$X_1, \ldots, X_{R_{m_k}(k)}$ and therefore the above variance is taken with respect to the
independent random variables $I_1(k), \ldots, I_{m_k}(k)$.

In the statement of our theorems the number of blocks used in the bootstrap
sample is

\[ m_k(\alpha) = [e^{\alpha k}], \]

where $\alpha$ is a positive real number to be suitably chosen later and $[\cdot]$ denotes
the integer part. To simplify the notation we will often write $m_k$ instead of
$m_k(\alpha)$.

Theorem 2.1. Let $(X_n)_{n\in\mathbb{Z}}$ be a chain of infinite order satisfying Hypotheses
H_1, H_2 and H_3 and such that $c > 18\ln(1/\delta)$, where $\delta$ and $c$ are
the constants appearing in H_1 and H_2, respectively. Then, for any
$\alpha \in (5\ln(1/\delta),\ c - \ln(1/\delta))$, for $m_k = [e^{\alpha k}]$, and for almost all realizations of
the chain $(X_n)_{n\in\mathbb{Z}}$, we have



— {Vl-h) ^AT(0,1), (2.5) 

°k 

as $k$ tends to $+\infty$, where $\xrightarrow{\mathcal{D}}$ denotes convergence in distribution and $\mathcal{N}(0,1)$
denotes the standard normal distribution.
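
To get a feeling for the constraints in Theorem 2.1, the short Python sketch below computes the admissible interval for $\alpha$ as it is stated in the theorem, and the resulting number of blocks $m_k = [e^{\alpha k}]$. The function names are hypothetical and the numerical values of $\delta$ and $c$ used in the example are arbitrary choices, not taken from the paper.

```python
import math


def admissible_alpha_interval(delta, c):
    """Interval (5 ln(1/delta), c - ln(1/delta)) from Theorem 2.1,
    or None when the assumption c > 18 ln(1/delta) fails."""
    log_inv = math.log(1.0 / delta)
    if c <= 18.0 * log_inv:
        return None
    return (5.0 * log_inv, c - log_inv)


def num_blocks(alpha, k):
    """m_k = integer part of exp(alpha * k)."""
    return math.floor(math.exp(alpha * k))


if __name__ == "__main__":
    delta, c = 0.5, 20.0 * math.log(2.0)
    print(admissible_alpha_interval(delta, c))
    print([num_blocks(4.0, k) for k in (1, 2, 3)])
```

The example makes visible how quickly the number of excursion blocks grows with the length $k$ of the reference string.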

The proof of Theorem 2.1 is based on the following sequential bootstrap
procedure, which is interesting in itself. Let $(X_n^{(k)})_{n\in\mathbb{Z}}$, $k = 1, 2, \ldots$,
be a sequence of stationary irreducible aperiodic Markov chains of order
$k = 1, 2, \ldots$, respectively, taking values in the same finite alphabet $A$ with
transition probabilities denoted by

\[ p^{(k)}(a \mid b_{-k,-1}) = \mathbb{P}(X_0^{(k)} = a \mid X^{(k)}_{-k,-1} = b_{-k,-1}). \]

We may assume, without loss of generality, that the Markov chains $(X_n^{(k)})_{n\in\mathbb{Z}}$
for k = 1,2, . . . are all defined on the same probability space (cf. for instance 

0)- 

We define

\[ \delta^{(k)} = \min_{a\in A}\ \inf_{(x_{-k},\ldots,x_{-1})\in A_a^{(k)}} p^{(k)}(a \mid x_{-1}, \ldots, x_{-k}) \]

and

\[ \delta = \inf\{\delta^{(k)} : k \ge 1\}, \tag{2.6} \]

where $A_a^{(k)} = \{(x_{-k}, \ldots, x_{-1}) : p^{(k)}(a \mid x_{-1}, \ldots, x_{-k}) > 0\}$.

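For each $k$ we define recursively the sequence of return times $(R_j^{(k)})_{j\in\mathbb{N}}$
by $R_0^{(k)} = 1$, and for $i \ge 1$

\[ R_i^{(k)} = \inf\{n > R_{i-1}^{(k)} : (X_n^{(k)}, \ldots, X_{n+k-1}^{(k)}) = (X_1^{(k)}, \ldots, X_k^{(k)})\}. \tag{2.7} \]

Let $\xi_i^{(k)}$ be the block of values of the chain $(X_n^{(k)})_{n\in\mathbb{Z}}$ from $R_{i-1}^{(k)}$ up to
$R_i^{(k)} - 1$, namely

\[ \xi_i^{(k)} = (X^{(k)}_{R_{i-1}^{(k)}}, \ldots, X^{(k)}_{R_i^{(k)}-1}). \]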

We construct a bootstrap sample of the Markov chain $(X_n^{(k)})_{n\in\mathbb{Z}}$ by performing
an i.i.d. selection of the blocks $\xi_1^{(k)}, \ldots, \xi_{m_k}^{(k)}$. The formal definition is the
following. For every $k$, let $I_1(k), \ldots, I_{m_k}(k)$ be $m_k$ independent random variables
with uniform distribution in the set $\{1, \ldots, m_k\}$. The bootstrap blocks
are defined by

\[ \xi_l^{(k)*} = \xi^{(k)}_{I_l(k)} \]

for $l = 1, \ldots, m_k$. The bootstrap sample $X_l^{(k)*}$, $l = 1, \ldots, R_{m_k}^{(k)*}$, is constructed
by concatenating the blocks $\xi_1^{(k)*}, \ldots, \xi_{m_k}^{(k)*}$. We observe that the return times
of the bootstrap sample assume the values $R_0^{(k)*} = 1$ and, for $l = 1, \ldots, m_k$,

\[ R_l^{(k)*} = R_{l-1}^{(k)*} + R^{(k)}_{I_l(k)+1} - R^{(k)}_{I_l(k)}. \]

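We consider the following estimator for $\mu^{(k)} = \mathbb{E}(f(X_1^{(k)}))$

\[ \hat\mu^{(k)} = \frac{1}{R_{m_k}^{(k)} - 1} \sum_{n=1}^{R_{m_k}^{(k)}-1} f(X_n^{(k)}). \tag{2.8} \]

Its bootstrap counterpart is given by

\[ \hat\mu^{(k)*} = \frac{1}{R_{m_k}^{(k)*} - 1} \sum_{n=1}^{R_{m_k}^{(k)*}-1} f(X_n^{(k)*}). \tag{2.9} \]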



We define 



.(*)* 









r£1* - 1 



(2.10) 



Recall that, as before, this variance is with respect to the independent random
variables $I_1(k), \ldots, I_{m_k}(k)$.
Theorem 2.2. Let $(X_n^{(k)})_{n\in\mathbb{Z}}$, $k = 1, 2, \ldots$, be a sequence of stationary,
irreducible, and aperiodic Markov chains of order $k = 1, 2, \ldots$, respectively,
taking values in the same finite alphabet $A$ and satisfying the following
hypotheses

\[ \delta > 0, \tag{2.11} \]

where $\delta$ is defined in (2.6), and

\[ \liminf_{k\to\infty} \mathbb{E}\left( \left( \sum_{n=1}^{R_1^{(k)}-1} \left( f(X_n^{(k)}) - \mu^{(k)} \right) \right)^2 \right) > 0. \tag{2.12} \]

If $\alpha > 5\ln(1/\delta)$ and $m_k = [e^{\alpha k}]$, then for almost all realizations of the chains
$(X_n^{(k)})_{n\in\mathbb{Z}}$, $k = 1, 2, \ldots$, we have



(k)* 



_£<*)) ^A/-(0,1) 



as k tends to +oo. 



3 Preliminary results 

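We first introduce some shorthand notation. We define

\[ Z_i^{(k)} = \sum_{n=R_{i-1}^{(k)}}^{R_i^{(k)}-1} \left( f(X_n^{(k)}) - \hat\mu^{(k)} \right), \]

and its bootstrap version is given by

\[ Z_i^{(k)*} = \sum_{n=R_{i-1}^{(k)*}}^{R_i^{(k)*}-1} \left( f(X_n^{(k)*}) - \hat\mu^{(k)} \right). \]

Note that $Z_i^{(k)*} = Z^{(k)}_{I_i(k)}$.

We use the shorthand $\mathbb{E}^*(\cdot)$ to denote $\mathbb{E}(\,\cdot \mid X_1^{(k)}, \ldots, X^{(k)}_{R^{(k)}_{m_k}})$ and $\mathrm{Var}^*(\cdot)$
to denote $\mathrm{Var}(\,\cdot \mid X_1^{(k)}, \ldots, X^{(k)}_{R^{(k)}_{m_k}})$. We recall that, in both cases, the
expectation is taken with respect to the sequence $I_i(k)$, $i = 1, \ldots, m_k$, of i.i.d.
random variables uniformly distributed in the set $\{1, \ldots, m_k\}$.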
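Lemma 3.1. The following equalities hold

\[ \mathbb{E}^*\left( Z_1^{(k)*} \right) = 0, \]

and

\[ \mathrm{Var}^*\left( \sum_{i=1}^{m_k} Z_i^{(k)*} \right) = \sum_{i=1}^{m_k} \left( Z_i^{(k)} \right)^2. \]

Proof. By definition

\[ \mathbb{E}^*\left( Z_1^{(k)*} \right) = \sum_{n=1}^{m_k} Z_n^{(k)}\, \mathbb{P}(I_1(k) = n) = \frac{1}{m_k} \sum_{n=1}^{m_k} Z_n^{(k)} = 0. \tag{3.1} \]

The second equality follows by a similar computation. ■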
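
The two identities of Lemma 3.1 can be checked exactly on a toy example by enumerating every possible choice of bootstrap indices. The Python sketch below does this with rational arithmetic. It is only a sanity check under the assumption that the sample consists of exactly $m$ complete blocks; the function names and the toy blocks are hypothetical.

```python
from fractions import Fraction
from itertools import product


def centered_block_sums(blocks, f):
    """Z_i = sum over block i of (f(x) - mu_hat), with mu_hat the overall mean."""
    total_len = sum(len(b) for b in blocks)
    mu_hat = Fraction(sum(f(s) for b in blocks for s in b), total_len)
    return [sum(Fraction(f(s)) - mu_hat for s in b) for b in blocks]


def bootstrap_moments(z):
    """Exact E*(Z_1*) and Var*(sum_i Z_i*) by enumerating all index choices."""
    m = len(z)
    sums = [sum(z[i] for i in idx) for idx in product(range(m), repeat=m)]
    mean_of_sum = sum(sums, Fraction(0)) / len(sums)
    var_of_sum = sum((s - mean_of_sum) ** 2 for s in sums) / len(sums)
    mean_first = sum(z, Fraction(0)) / m
    return mean_first, var_of_sum


if __name__ == "__main__":
    z = centered_block_sums([[0, 1, 1], [0, 1, 0], [1, 1, 1, 0]], int)
    print(bootstrap_moments(z), sum(v * v for v in z))
```

Because the centering uses the empirical mean of the same blocks, the block sums add up to zero, which is exactly why the bootstrap expectation vanishes.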



It is convenient to introduce a new family of random variables $\tilde Z_i^{(k)}$, where
$i = 1, \ldots, m_k$, defined as follows

\[ \tilde Z_i^{(k)} = \sum_{n=R_{i-1}^{(k)}}^{R_i^{(k)}-1} \left( f(X_n^{(k)}) - \mu^{(k)} \right). \]

These random variables are not only identically distributed (as was already
the case for $(Z_i^{(k)})$), but they are also independent and have zero mean.
Moreover the following relation holds

\[ Z_i^{(k)} = \tilde Z_i^{(k)} + \left( \mu^{(k)} - \hat\mu^{(k)} \right) D_i^{(k)}. \tag{3.3} \]

We define $D_i^{(k)} = R_i^{(k)} - R_{i-1}^{(k)}$ (recall that $R_0^{(k)} = 1$). Similarly, we define

\[ D_i^{(k)*} = R_i^{(k)*} - R_{i-1}^{(k)*}. \]

Lemma 3.2. There is a positive constant $C$ independent of $k$ such that

\[ |Z_1^{(k)}| \le C D_1^{(k)}, \quad\text{and}\quad |\tilde Z_1^{(k)}| \le C D_1^{(k)}. \]

Proof. This result follows immediately from the fact that the observable $f$
has finite range. ■
Lemma 3.3. There is a constant $C > 0$ such that, for any $k \ge 1$, the
following inequality holds



E ( - pW) 2 ) < C V 



Proof. By definition we have 

2^z=i -^z 

and therefore, using the Markov property and the stationarity of the chain, 
we have 

E 2 ) = (3.4) 

m k E V 1 9 + m k (m k - 1)E x - 2 7r ^ . (3.5) 

Since $\sum_{l=1}^{m_k} D_l^{(k)} \ge m_k$, and using Lemma 3.2, we conclude that the first term
on the right-hand side of expression (3.4) is bounded above by

\[ C\, \frac{\mathbb{E}\left( (D_1^{(k)})^2 \right)}{m_k} \tag{3.6} \]

where $C > 0$ is a constant independent of $k$.

To obtain an upper bound for the second term on the right-hand side of
expression (3.4), we first observe that for $m_k > 4$ we have



E{ (Ekhv E \(Ekky ] (3 ' 7) 



Z {k) Z {k) (D {k) + D^) 2 \ _ 2E / Z[ k) Z { 2 k) (D[ k) + Dj k) ) 

Z=3 "^Z 



The independence of , an d Ei=3 imply that 



E I i 2 7rrT |=0. 

(EISA ) 
10 



Using again Lemma 3.2, Hölder's inequality and $D_j^{(k)} \ge 1$, we deduce
that the sum of the absolute values of the two remaining terms of the
right-hand side of expression (3.7) is bounded above by



C ^ | ^ 



nr 



mi 



(3.8) 



where $C$ is a positive constant independent of $k$. Since $D_1^{(k)} \ge 1$, inequalities
(3.6) and (3.8) conclude the proof. ■



Lemma 3.4. For any integer $k$ and any positive real number $t$ the following
inequality holds

\[ \mathbb{P}\left( D_1^{(k)} > t \right) \le \left( 1 - \delta^k \right)^{[t/k]}. \]
Proof. We observe that 



Wk] 



Now we rewrite the right-hand side of the above inequality, by conditioning 
on the values of the initial k symbols 



[t/k] 



Y {h) - n 



The second factor in the above sum can be rewritten as 

[t/h]-i 



1-PU 



(A;) _ 
[t/k]k+l,([t/k]+l)k ~ ai ' k 

[t/fc]-l 



n ^ a i,fe} n { x ^ k) k - ^ 

3=1 



X 



A l, fc - a l,k 



Using (2.11) this last expression can be bounded above by

\t/k]-i 



A l,fc - a M 
The lemma now follows by recursion. ■



Lemma 3.5. There exists a positive constant $C$, such that for any positive
integer $r$ and any positive integer $k$, the following inequality holds

\[ \mathbb{E}\left( (D_1^{(k)})^r \right) \le C\, k^r \left( \frac{1}{\delta} \right)^{kr}. \]

Proof. The result follows immediately from Lemma 3.4. ■
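
One way to see how a moment bound of the form $k^r \delta^{-kr}$ comes out of the tail bound of Lemma 3.4 is to sum the tail numerically. For an integer-valued variable $D \ge 1$ one has $\mathbb{E}(D^r) = \sum_{t \ge 0} ((t+1)^r - t^r)\,\mathbb{P}(D > t)$, and grouping the values of $t$ by $[t/k]$ reduces the bound to $k^r$ times the $r$-th moment of a geometric variable with success probability $\delta^k$. The sketch below evaluates this bound and its ratio to $k^r \delta^{-kr}$. The function names and the truncation level are assumptions made for the illustration.

```python
def moment_bound_from_tail(delta, k, r, n_groups=2000):
    """Sum ((t+1)^r - t^r) * (1 - delta^k)^floor(t/k) over t < n_groups * k.

    This bounds E(D^r) for an integer-valued D >= 1 whose tail satisfies
    P(D > t) <= (1 - delta^k)^floor(t/k); the series is truncated.
    """
    q = 1.0 - delta ** k
    total = 0.0
    for t in range(n_groups * k):
        total += ((t + 1) ** r - t ** r) * q ** (t // k)
    return total


def ratio_to_scale(delta, k, r):
    """Ratio of the bound above to k^r * delta^(-k r)."""
    return moment_bound_from_tail(delta, k, r) / (k ** r * delta ** (-k * r))


if __name__ == "__main__":
    for r in (1, 2):
        print(r, ratio_to_scale(0.5, 2, r))
```

With $\delta = 1/2$ and $k = 2$ the geometric variable has success probability $1/4$, so the ratio equals $1$ for $r = 1$ and $2 - 1/4 = 1.75$ for $r = 2$.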



4 Proof of Theorem 2.2



We can now start the proof of Theorem 2.2. We first observe that



{^ k) * - fi {k) ) = ^ i=1 1 . (4.1) 



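We want to prove that the right-hand side of (4.1) converges in distribution
to a standard normal distribution, when $k \to +\infty$. By the Lindeberg-Feller
Central Limit Theorem for double arrays (see, for instance, Billingsley 1999),
this will follow once we show that for any $\epsilon > 0$

\[ \lim_{k\to\infty} \frac{\mathbb{E}^*\left( (Z_1^{(k)*})^2\, \mathbf{1}_{\{ (Z_1^{(k)*})^2 > \epsilon\, m_k \mathrm{Var}^*(Z_1^{(k)*}) \}} \right)}{\mathrm{Var}^*(Z_1^{(k)*})} = 0. \tag{4.2} \]

Using Lemma 3.1 we can rewrite (4.2) as

\[ \lim_{k\to\infty} \frac{\sum_{i=1}^{m_k} (Z_i^{(k)})^2\, \mathbf{1}_{\{ (Z_i^{(k)})^2 > \epsilon \sum_{j=1}^{m_k} (Z_j^{(k)})^2 \}}}{\sum_{i=1}^{m_k} (Z_i^{(k)})^2} = 0. \tag{4.3} \]

Since

\[ \mathbf{1}_{\{ (Z_i^{(k)})^2 > \epsilon \sum_{j=1}^{m_k} (Z_j^{(k)})^2 \}} \le \frac{(Z_i^{(k)})^2}{\epsilon \sum_{j=1}^{m_k} (Z_j^{(k)})^2}, \tag{4.4} \]

the fraction on the left-hand side of expression (4.3) is bounded above by

\[ \frac{\sum_{i=1}^{m_k} (Z_i^{(k)})^4}{\epsilon \left( \sum_{i=1}^{m_k} (Z_i^{(k)})^2 \right)^2}. \tag{4.5} \]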

To prove that expression (4.5) vanishes as $k$ diverges, we will obtain a
sequence of almost sure upper bounds for its numerator and a sequence of
almost sure lower bounds for its denominator.
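
The reduction from the Lindeberg-type fraction to a ratio of fourth and second moments can be illustrated numerically. The Python sketch below computes, for a list of numbers playing the role of the block sums, the truncated fraction and the fourth-moment bound obtained by replacing the indicator with $z^2 / (\epsilon \sum_j z_j^2)$. The function names and the test values are hypothetical.

```python
def lindeberg_fraction(z, eps):
    """sum_i z_i^2 1{z_i^2 > eps * sum_j z_j^2} divided by sum_i z_i^2."""
    total = sum(v * v for v in z)
    return sum(v * v for v in z if v * v > eps * total) / total


def fourth_moment_bound(z, eps):
    """sum_i z_i^4 divided by eps * (sum_i z_i^2)^2."""
    total = sum(v * v for v in z)
    return sum(v ** 4 for v in z) / (eps * total * total)


if __name__ == "__main__":
    z = [1.0, -2.0, 0.5, 3.0]
    print(lindeberg_fraction(z, 0.3), fourth_moment_bound(z, 0.3))
```

When no single term dominates the sum of squares, as for a long list of equal values, the truncated fraction is exactly zero, which is the situation the almost sure bounds below are designed to reach asymptotically.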

Lemma 4.1. For any $\alpha > 0$ and for any $v > 1 + 4\ln(1/\delta)/\alpha$, if $m_k = [e^{\alpha k}]$,
then for almost all samples the upper bound

\[ \sum_{i=1}^{m_k} (Z_i^{(k)})^4 \le m_k^v \]

holds, for all $k$ large enough.

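Proof. Markov's inequality and Lemmas 3.2 and 3.5 imply that

\[ \mathbb{P}\left( \sum_{i=1}^{m_k} (Z_i^{(k)})^4 > m_k^v \right) \le m_k^{1-v}\, \mathbb{E}\left( (Z_1^{(k)})^4 \right) \le C k^4 \delta^{-4k} m_k^{1-v}, \tag{4.7} \]

where $C > 0$ does not depend on $k$. Since by hypothesis $\alpha(v - 1) >
4\ln(1/\delta)$, we conclude that the right-hand side of expression (4.7) is summable.
This together with the Borel-Cantelli Lemma concludes the proof of the
lemma. ■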

The next step is to find a lower bound for the denominator. 

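Lemma 4.2. For any $\alpha > 4\ln(1/\delta)$, and for any summable sequence of
non-negative real numbers $\eta_k$, $k = 1, 2, \ldots$, if $m_k = [e^{\alpha k}]$, then, for almost all
samples, the lower bound

\[ \sum_{i=1}^{m_k} (\tilde Z_i^{(k)})^2 \ge \eta_k\, m_k\, \mathbb{E}\left( (\tilde Z_1^{(k)})^2 \right) \]

holds, for all $k$ large enough.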
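Proof. To simplify the notation, let us call

\[ W^{(k)} = \sum_{i=1}^{m_k} (\tilde Z_i^{(k)})^2. \]

By definition we have

\[ \mathbb{E}\left( W^{(k)} \right) = m_k\, \mathbb{E}\left( (\tilde Z_1^{(k)})^2 \right). \tag{4.8} \]

Using the fact that the random variables $\tilde Z_i^{(k)}$
are independent, identically distributed and have zero mean we get

\[ \mathbb{E}\left( (W^{(k)})^2 \right) = m_k (m_k - 1) \left( \mathbb{E}\left( (\tilde Z_1^{(k)})^2 \right) \right)^2 + m_k\, \mathbb{E}\left( (\tilde Z_1^{(k)})^4 \right). \tag{4.9} \]

Using the inequality of Paley-Zygmund, for $0 < \eta < 1$, together with the
identities (4.8) and (4.9) we obtain the inequality

\[ \mathbb{P}\left( W^{(k)} > \eta\, \mathbb{E}(W^{(k)}) \right) \ge \frac{(1 - \eta)^2\, m_k^2 \left( \mathbb{E}\left( (\tilde Z_1^{(k)})^2 \right) \right)^2}{m_k (m_k - 1) \left( \mathbb{E}\left( (\tilde Z_1^{(k)})^2 \right) \right)^2 + m_k\, \mathbb{E}\left( (\tilde Z_1^{(k)})^4 \right)}. \]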

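The right-hand side of the above expression can be rewritten as

\[ (1 - \eta)^2 \left( 1 - \frac{1}{m_k} + \frac{\mathbb{E}\left( (\tilde Z_1^{(k)})^4 \right)}{m_k \left( \mathbb{E}\left( (\tilde Z_1^{(k)})^2 \right) \right)^2} \right)^{-1}. \tag{4.10} \]

Therefore Lemma 3.2 and Hypothesis (2.12) imply that

\[ \mathbb{P}\left( W^{(k)} > \eta\, \mathbb{E}(W^{(k)}) \right) \ge (1 - \eta)^2 \left( 1 + \frac{C\, \mathbb{E}\left( (D_1^{(k)})^4 \right)}{m_k} \right)^{-1}, \tag{4.11} \]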
where $C > 0$ does not depend on $k$. From this it follows immediately that



P (W (k) < r]E(W (k) )) < 



1 



CE (D\ 



2rj — if 



1--^ + 



CE \D\ 



(4.12) 



Lemma 3.5 and the choice of $\alpha$ imply that the quantity

1 c E ((^») 4 ) 



1 

< - 
~ 2 



(4.13) 



for $k$ large enough. Therefore inequality (4.12) implies that



P (W {k) < r]E(W (k) )) < 2 



+ 4:7], 



(4.14) 



for $k$ large enough. Using again Lemma 3.5 it follows from (4.14) that



+oo 



J^P {W ik) < r] k E(W {k) )) < +oo 



(4.15) 



k=X 



for any summable sequence of non-negative real numbers $\eta_k$, $k = 1, 2, \ldots$.
As a consequence, the Borel-Cantelli Lemma implies that



i=l ^ 



(4.16) 



almost surely for k large enough. 
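
The Paley-Zygmund step used in the proof above states that a non-negative random variable $W$ with finite second moment satisfies $\mathbb{P}(W > \eta\,\mathbb{E}W) \ge (1-\eta)^2 (\mathbb{E}W)^2 / \mathbb{E}(W^2)$ for $0 < \eta < 1$. The following Python sketch evaluates both sides exactly for a small discrete distribution. The function name and the example distribution are hypothetical and serve only to make the inequality concrete.

```python
from fractions import Fraction


def paley_zygmund_sides(values, probs, eta):
    """Return (P(W > eta * E W), (1 - eta)^2 (E W)^2 / E(W^2)) for a
    non-negative discrete random variable given by values and probabilities."""
    mean = sum(v * p for v, p in zip(values, probs))
    second = sum(v * v * p for v, p in zip(values, probs))
    lhs = sum(p for v, p in zip(values, probs) if v > eta * mean)
    rhs = (1 - eta) ** 2 * mean ** 2 / second
    return lhs, rhs


if __name__ == "__main__":
    values = [Fraction(0), Fraction(1), Fraction(4)]
    probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
    print(paley_zygmund_sides(values, probs, Fraction(1, 2)))
```

In the proof the inequality is applied to $W^{(k)}$, whose first two moments are given by (4.8) and (4.9).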



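Lemma 4.3. For any $\alpha > 4\ln(1/\delta)$, if $m_k = [e^{\alpha k}]$, then, for almost all
samples, the following limit holds

\[ \lim_{k\to\infty} \frac{\sum_{i=1}^{m_k} (Z_i^{(k)})^4}{\left( \sum_{i=1}^{m_k} (\tilde Z_i^{(k)})^2 \right)^2} = 0. \]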
Proof. The result follows at once from Lemmas 4.1 and 4.2 and the
Borel-Cantelli Lemma by taking $1 + 4\ln(1/\delta)/\alpha < v < 2$ and, for instance,
$\eta_k = 1/k^2$. ■

The expression in the statement of the above lemma is similar to (4.5)
with $Z_l^{(k)}$ replaced by $\tilde Z_l^{(k)}$ in the denominator. Therefore to conclude the
proof of Theorem 2.2 we need the following lemma.

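Lemma 4.4. For any $\alpha > 5\ln(1/\delta)$, if $m_k = [e^{\alpha k}]$, then, for almost all
samples, the following limit holds

\[ \lim_{k\to\infty} \frac{\sum_{l=1}^{m_k} (Z_l^{(k)})^2}{\sum_{l=1}^{m_k} (\tilde Z_l^{(k)})^2} = 1. \]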

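Proof. An elementary computation shows that for any real numbers $a$ and
$b$, and for any $\epsilon > 0$ one has

\[ (1 - \epsilon) a^2 + (1 - \epsilon^{-1}) b^2 \le (a + b)^2 \le (1 + \epsilon) a^2 + (1 + \epsilon^{-1}) b^2. \]

We apply this inequality for each $l = 1, \ldots, m_k$, with $a = \tilde Z_l^{(k)}$, and
$b = (\mu^{(k)} - \hat\mu^{(k)}) D_l^{(k)}$. Summing up over $l$ and using identity (3.3) we obtain the
inequalities

\[ 1 - \epsilon + (1 - \epsilon^{-1})\, c^{(k)} \le \frac{\sum_{l=1}^{m_k} (Z_l^{(k)})^2}{\sum_{l=1}^{m_k} (\tilde Z_l^{(k)})^2} \le 1 + \epsilon + (1 + \epsilon^{-1})\, c^{(k)}, \tag{4.17} \]

where

\[ c^{(k)} = \frac{\left( \mu^{(k)} - \hat\mu^{(k)} \right)^2 \sum_{l=1}^{m_k} (D_l^{(k)})^2}{\sum_{l=1}^{m_k} (\tilde Z_l^{(k)})^2}. \tag{4.18} \]

To conclude the proof it remains to show that $c^{(k)}$ converges to zero almost
surely as $k$ diverges.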
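
For completeness, the elementary inequality used at the start of this proof can be derived as follows. This is a standard argument written out under no assumptions beyond $\epsilon > 0$ and real $a$, $b$.

```latex
% Expanding two squares gives a bound on the cross term:
\[
0 \le \left( \sqrt{\epsilon}\, a \mp \frac{b}{\sqrt{\epsilon}} \right)^2
  = \epsilon a^2 \mp 2ab + \epsilon^{-1} b^2
\quad\Longrightarrow\quad
|2ab| \le \epsilon a^2 + \epsilon^{-1} b^2 .
\]
% Inserting this into (a+b)^2 = a^2 + 2ab + b^2 yields both sides:
\[
(1 - \epsilon) a^2 + (1 - \epsilon^{-1}) b^2
\le a^2 + 2ab + b^2
\le (1 + \epsilon) a^2 + (1 + \epsilon^{-1}) b^2 .
\]
```

The point of the free parameter $\epsilon$ is that the coefficient of $a^2$ can be made arbitrarily close to $1$, at the price of a large coefficient in front of $b^2$, which is harmless once the $b^2$ contribution is shown to vanish.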

Using Lemma 3.3, Markov's inequality and the Borel-Cantelli Lemma, it
follows immediately that for any summable sequence of positive numbers $\rho_k$,
$k \ge 1$, and for almost all samples, the following inequality holds



(^)-^)) 2 < 



C 
Pk 



E 



E ( (d™ 



mi 



(4.19) 
for all $k$ large enough, where $C$ is a positive constant independent of $k$. We
also observe that for the same sequence $\rho_k$ the inequality

\[ \sum_{l=1}^{m_k} (D_l^{(k)})^2 \le \frac{m_k}{\rho_k}\, \mathbb{E}\left( (D_1^{(k)})^2 \right), \tag{4.20} \]

holds almost surely for all $k$ large enough.

Combining Lemma 4.2 and Hypothesis (2.12) we conclude that for any
summable sequence $\eta_k$, $k \ge 1$, and for almost all samples, the following
inequality holds

\[ \sum_{l=1}^{m_k} (\tilde Z_l^{(k)})^2 \ge C m_k \eta_k, \tag{4.21} \]

for all $k$ large enough, where $C$ is a strictly positive constant independent of
$k$.

Using inequalities (4.19), (4.20), (4.21), and using Lemma 3.3, we deduce
that for almost all samples, the following inequality holds

\[ c^{(k)} \le C\, \frac{e^{-k(\alpha - 5\ln(1/\delta))}}{\rho_k^2\, \eta_k} \]

for all $k$ large enough, where $C$ is a positive constant independent of $k$. Since
by hypothesis $\alpha > 5\ln(1/\delta)$, it is enough to take for instance $\rho_k = \eta_k = 1/k^2$
to conclude that $c^{(k)}$ converges to zero almost surely. Recalling that inequality
(4.17) holds for any fixed $\epsilon > 0$, the lemma follows. ■

Combining Lemmas 4.3 and 4.4 it follows that almost surely

\[ \lim_{k\to\infty} \frac{\sum_{l=1}^{m_k} (Z_l^{(k)})^4}{\left( \sum_{l=1}^{m_k} (Z_l^{(k)})^2 \right)^2} = 0. \tag{4.22} \]

This implies (4.2) and finishes the proof of Theorem 2.2.



5 Proof of Theorem 2.1

The basic idea of the proof is to approximate the chain of infinite order by
a sequence of Markov chains of increasing order satisfying the hypotheses of
Theorem 2.2. We will use for this purpose the canonical Markov approximation
$(X_n^{[k]})_{n\in\mathbb{Z}}$ of the chain $(X_n)_{n\in\mathbb{Z}}$, which is the Markov chain of order $k$
whose transition probabilities are defined by

\[ p^{[k]}(b \mid a_1, \ldots, a_k) := \mathbb{P}(X_{k+1} = b \mid X_j = a_j,\ 1 \le j \le k) \tag{5.1} \]

for all integers $k \ge 1$ and $a_1, \ldots, a_k, b \in A$.

From now on we only consider stationary chains. The sequence of stationary
canonical Markov approximations can be constructed together with
the stationary chain of infinite order on the same probability space $(\Omega, \mathcal{F}, \mathbb{P})$.
In particular they can be constructed together using the well-known maximal
coupling (see, for instance, Appendix A.1 in Barbour, Holst and Janson,
1992). For details of this construction in the present context we refer the
reader to Fernandez and Galves (2002).

Before starting the proof of Theorem 2.1 we will recall a few results from
the literature which will be used in the sequel. The following theorem was
proven in Fernandez and Galves (2002).

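Theorem. Let $(X_n)_{n\in\mathbb{Z}}$ be a chain of infinite order on the finite alphabet $A$
and satisfying the conditions

\[ \sum_{a\in A}\ \inf_{(\ldots,x_{-2},x_{-1})} p(a \mid x_{-1}, x_{-2}, \ldots) > 0 \quad\text{and}\quad \sum_{l\ge 1} \beta_l < +\infty. \]

Then the construction of the chains using the maximal coupling satisfies the
following inequality, for some positive constant $C$,

\[ \mathbb{P}\{X_0^{[k]} \ne X_0\} \le C \beta_k. \tag{5.2} \]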



The following theorem is a particular case of the main theorem of Bressaud,
Fernandez and Galves (1999). For convenience of the reader we will
reformulate the result in the framework in which it will be used in the proofs
below.

Theorem. If hypotheses H_1 and H_2 are satisfied then the chain $(X_n)_{n\in\mathbb{Z}}$ is
exponentially φ-mixing.

For a definition of φ-mixing chains we refer the reader to Billingsley
(1999). To make the connection between the present hypotheses and the
assumptions of Bressaud et al. (1999) we note that hypotheses H_1 and H_2
imply that the sequence of log-continuity rates $(\gamma_l)$ defined by

\[ \gamma_l = \max_{a\in A}\ \sup_{x_i = y_i,\ i=-l,\ldots,-1} \left| \frac{p(a \mid x_{-1}, x_{-2}, \ldots)}{p(a \mid y_{-1}, y_{-2}, \ldots)} - 1 \right| \]

is exponentially decreasing and therefore satisfies the hypotheses of this
paper.

We can now start the proof of Theorem 2.1. First of all we will use the
above mentioned result by Fernandez and Galves (2002) to obtain an upper
bound for the probability of discrepancies in the first $r$ symbols for the
coupled realizations of the chain $(X_n)_{n\in\mathbb{Z}}$ and its canonical Markov
approximation of order $k$, $(X_n^{[k]})_{n\in\mathbb{Z}}$. More precisely let us define

\[ \Delta_r^{[k]} := \{X_t^{[k]} = X_t,\ t = 1, \ldots, r\}, \]

which is the set of coincidence up to time $r$ of the chains $(X_n^{[k]})_{n\in\mathbb{Z}}$ and
$(X_n)_{n\in\mathbb{Z}}$.

Lemma 5.1. Let $(X_n)_{n\in\mathbb{Z}}$ be a chain of infinite order satisfying conditions
H_1 and H_2 with $\beta_l$ summable. Then there exists a positive constant $C$ such
that

\[ \mathbb{P}\{(\Delta_r^{[k]})^c\} \le C r \beta_k. \]

We will now check that the hypotheses of Theorem 2.2 are satisfied by
the sequence of canonical Markov approximations $(X_n^{[k]})_{n\in\mathbb{Z}}$, $k \ge 1$.

Lemma 5.2. Under assumption H_1 we have

\[ \inf\{\delta^{[k]} : k \ge 1\} \ge \delta, \]

where

\[ \delta^{[k]} = \min_{a\in A}\ \inf_{(x_{-k},\ldots,x_{-1})} p^{[k]}(a \mid x_{-1}, \ldots, x_{-k}). \]

Proof. Follows at once from the properties of the conditional probability.
■

This lemma establishes condition (2.11). The proof that condition (2.12)
holds follows from the next three lemmas. Let us define

\[ Z_i(k) = \sum_{n=R_{i-1}(k)}^{R_i(k)-1} \left( f(X_n) - \mu \right) \quad\text{and}\quad Z_i^{[k]} = \sum_{n=R_{i-1}^{[k]}}^{R_i^{[k]}-1} \left( f(X_n^{[k]}) - \mu^{[k]} \right), \]

where $R_i^{[k]}$ is defined as in expression (2.7) using the chain $(X_n^{[k]})_{n\in\mathbb{Z}}$ and
$\mu^{[k]} = \mathbb{E}(f(X_1^{[k]}))$.

Lemma 5.3. Under Hypotheses H_1, H_2 and H_3 the chain $(X_n)_{n\in\mathbb{Z}}$ satisfies
the inequality

\[ \liminf_{k\to\infty} \mathbb{E}\left( (Z_1(k))^2 \right) > 0. \]

Proof. Markov's inequality implies that

\[ \mathbb{E}\left( (Z_1(k))^2 \right) \ge u^2\, \mathbb{P}\left\{ (Z_1(k))^2 > u^2 \right\}, \]

for any real number $u$. Recalling that $R_1(k) > 1$, we obtain the lower bound

E ((^) (5,, 

By the above mentioned theorem from Bressaud et al. (1999), the process
$(f(X_n))_n$ is exponentially φ-mixing. Therefore it follows from classical results
on the Central Limit Theorem (cf. for instance Theorems 20.1 and 20.3 from
Billingsley 1999)

^tL^N^o- 2 ) 

as $k$ diverges. Hypothesis H_3 ensures that $\sigma > 0$. This implies that for any
fixed $u$ and any $k$ large enough the lower bound provided by inequality (5.3)
is greater than a fixed strictly positive real number. This concludes the proof
of the lemma. ■

We define $D_i(k) = R_i(k) - R_{i-1}(k)$.

Lemma 5.4. For any integer $k$, any integer $r \le 4$ and any positive real
number $t$ the following inequalities hold

\[ \mathbb{P}\left( D_1(k) > t \right) \le \left( 1 - \delta^k \right)^{[t/k]} \quad\text{and}\quad \mathbb{E}\left( (D_1(k))^r \right) \le C k^r \left( \frac{1}{\delta} \right)^{kr}, \]

where $C$ is a positive constant.

Proof. The proof is exactly the same as the proofs of Lemmas 3.4 and 3.5.
Lemma 5.5. Under the conditions of Theorem 2.1 the sequence of canonical
Markov approximations satisfies the inequality

\[ \liminf_{k\to\infty} \mathbb{E}\left( (Z_1^{[k]})^2 \right) > 0. \]

Proof. We will first derive an upper bound for the modulus of the
difference

\[ \left| \mathbb{E}\left( (Z_1(k))^2 - (Z_1^{[k]})^2 \right) \right| = \left| \mathbb{E}\left( (Z_1(k) - Z_1^{[k]})(Z_1(k) + Z_1^{[k]}) \right) \right|. \]

The finiteness of the alphabet $A$ implies that



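\[ \left| Z_1(k) + Z_1^{[k]} \right| \le C \left( R_1(k) + R_1^{[k]} \right), \tag{5.4} \]

where $C = \max\{|f(a)| : a \in A\}$. We observe also that

\[ \left| Z_1(k) - Z_1^{[k]} \right| \le \sum_{n=1}^{R_1(k)\wedge R_1^{[k]}} \left| Y_n - Y_n^{[k]} \right| + C \left| R_1(k) - R_1^{[k]} \right|, \tag{5.5} \]

where $Y_n = f(X_n) - \mu$ and $Y_n^{[k]} = f(X_n^{[k]}) - \mu^{[k]}$.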

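In the sequel we will no longer specify the different positive constants
appearing in the various estimates. Moreover they will all be denoted by the
letter $C$. Combining inequalities (5.4) and (5.5) we obtain

\[ \left| \mathbb{E}\left( (Z_1(k))^2 - (Z_1^{[k]})^2 \right) \right| \le C\, \mathbb{E}\left( \left| R_1(k) - R_1^{[k]} \right| \left( R_1(k) + R_1^{[k]} \right) \right) + C\, \mathbb{E}\left( \sum_{n=1}^{R_1(k)\wedge R_1^{[k]}} \left| Y_n - Y_n^{[k]} \right| \left( R_1(k) + R_1^{[k]} \right) \right). \tag{5.6} \]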



We will estimate separately each term. For the second term we have 



'Ri(k)AR{ 



E 



J2 {Yn-Y^WR^ + Rf 



Ml 



n=l 



R!(k)AR l ; 



E I 1 W)e 



Ml 



n=l 
<CE(l (A M )c (i^A;)+4 fe1 ) 2



Using Schwarz inequality and Lemmas 3.5, 5.1 and 5.4 we obtain the upper
bound



E I l (a «, 



1/2 



E ((^(Jfe) + i?f ] ) 4 ) V2 < Ck 5 / 2 (3t /2 5- 21 



We now come to the estimation of the first term in (5.6). Using Schwarz
inequality and Lemmas 3.5 and 5.4 we get



E (\Ri(k) - Rf ] \\Ri(k) + 4 fc] |) < E ((fli(ife) - tff 1 ) 2 ) E + 4' 



1/2 



We now have 

E ((J2 X (A0 - itf ) 2 ) = E (l Afl - M fcl ) 2 )+E (l (A M )0 (JM*) - itf) 

and the last term is estimated as above. For the first term, we have 



E (l^i^A;) - #f ] ) 2 ) <E 



/ 



(R 1 (k) + R^) 



[k]\2 



\ 



Ri{k)AR[ k] +k-l 



1 - 



n 



x\ k] =x 



<E((i2i(ife) + i2 1 * ] ) 4 ) 1/2]E 



// 



i?l(fc)Ai?f , +fc-l 



j=Ri(k)AR,f ] 

\V /2 



j -^3 



11 



n 



y y j=R!(k)AR i i 



Ik) 



xf ] =Xj 



11 



( ( .Ri(fc)A-f4 fe] +fc-l 



< Ck 2 5~ 2k E 



1/2 



1 - 



n 



xf=x, 



11 



\ \ j=Ri(k)AR? 

where we have used again Schwarz inequality and Lemmas 3.5 and 5.4. We
now have



E 



Ri(k)AR[ k] +k-l 



1 



n 



\ \ j=Ri(k)AR 



-x^=x 3 



11 



P =i 



{k)AR lk] =p 



p+k-1 

i n i 

3=P 



xf ] =x 3 
Using Schwarz inequality, stationarity and Lemmas 3.5, 5.1 and 5.4, this
is bounded above by

1/2 

1/2 



vP=l / 

1/2 / \ 1/2 



(if)' 



<E((i? 1 (A;) + M fc] ) 2 ) E(l (A w )B ) <Ck^5' k (3t /2 
Collecting together the above bounds we get 

E ((^(A;)) 2 - (Zf 1 ) 2 ) | < C (k">/ 2 5- 2k (3 l k 12 + fc 19 / 8 r 9fe / 4 /3 fc 1/8 ) • 

It follows from this inequality and the assumption $c > 18\log\delta^{-1}$ that

lim E (( Zl (k)) 2 - (Z?) 2 ) =0. 

k— >oo \ / 

This together with Lemma 5.3 concludes the proof of the lemma. ■



In order to prove Theorem 2.1 we need to construct together the bootstrap
samples of $(X_n)_{n\in\mathbb{Z}}$ and $(X_n^{[k]})_{n\in\mathbb{Z}}$. We recall that we have already assumed
that $(X_n)_{n\in\mathbb{Z}}$ and $(X_n^{[k]})_{n\in\mathbb{Z}}$ are constructed together using the maximal
coupling. Now, given two coupled realizations of these chains we will use the
same realization of the sequence of random indices to choose the blocks
entering in the bootstrap samples of the chains. Formally, for every fixed $k \ge 1$
the bootstrap blocks will be defined as

\[ \xi_l^*(k) = \xi_{I_l(k)}(k) \quad\text{and}\quad \xi_l^{[k]*} = \xi^{[k]}_{I_l(k)}, \]

where $I_1(k), \ldots, I_{m_k}(k)$ are the same independent random variables with
uniform distribution in the set $\{1, \ldots, m_k\}$.

The next lemma says that the coupled samples of $(X_n)_{n\in\mathbb{Z}}$ and $(X_n^{[k]})_{n\in\mathbb{Z}}$
coincide up to time $R_{m_k}(k)$ with overwhelming probability.

Lemma 5.6. Under the hypotheses of Theorem 2.1 we have

\[ \lim_{k\to\infty} \mathbb{P}\left( (\Delta^{[k]}_{R_{m_k}(k)})^c \right) = 0. \]

Proof. We observe that for any $r > 0$ we have

\[ \mathbb{P}\left( (\Delta^{[k]}_{R_{m_k}(k)})^c \right) \le \mathbb{P}\left( (\Delta^{[k]}_r)^c \right) + \mathbb{P}\left( R_{m_k}(k) > r \right). \tag{5.7} \]

By Lemma 5.1 the first term on the right-hand side of (5.7) is bounded above
by $C r \beta_k$.

It follows from Lemmas 5.4 and 5.2 that the second term on the right-hand
side of (5.7) is bounded above by

\[ m_k\, \mathbb{P}\left( D_1(k) > r/m_k \right) \le m_k \left( 1 - \delta^k \right)^{[r/(k m_k)]}. \]

We now set $r = \lambda k^2 m_k \delta^{-k}$, where $\lambda$ is a fixed number strictly larger than
$\alpha$. With this choice of $r$ the two terms in inequality (5.7) tend to 0 when $k$
diverges. This concludes the proof of the lemma. ■
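
The choice $r = \lambda k^2 m_k \delta^{-k}$ in the proof above can be explored numerically. The Python sketch below returns the logarithms of the two bounds, $r\beta_k$ and $m_k(1-\delta^k)^{[r/(k m_k)]}$, with the constant $C$ dropped. It rests on a simplifying assumption that is not in the paper: the continuity rate is taken to be exactly $\beta_k = e^{-ck}$, whereas H_2 only controls its exponential decay rate. The function name and parameter values are hypothetical.

```python
import math


def log_coupling_terms(k, alpha, lam, delta, c):
    """Logs of the two terms bounding the discrepancy probability in (5.7).

    Assumes beta_k = exp(-c * k) exactly (a simplification), drops the
    constant C, and uses r = lam * k^2 * m_k * delta^(-k).
    """
    m_k = math.floor(math.exp(alpha * k))
    r = lam * k ** 2 * m_k * delta ** (-k)
    log_first = math.log(r) - c * k
    log_second = math.log(m_k) + math.floor(r / (k * m_k)) * math.log1p(-delta ** k)
    return log_first, log_second


if __name__ == "__main__":
    for k in (2, 4, 6):
        print(k, log_coupling_terms(k, 4.0, 5, 0.5, 20.0 * math.log(2.0)))
```

The first term decays because $\alpha < c - \ln(1/\delta)$, and the second because $\lambda > \alpha$, which is how the two endpoints of the interval for $\alpha$ in Theorem 2.1 enter this step.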

We can now conclude the proof of Theorem 2.1. First of all we observe
that



°l °l V R l Z - l 



R* (k) - 1 JR* (k) - 1 



Lemma 5.6 ensures that the last two terms are equal to zero with probability
tending to 1 when $k$ tends to infinity. Theorem 2.2 implies



Finally we observe that Lemma 5.6 ensures that



<7M* IR*m k (k)-l 



Inn P ( '/-^ = 1 



This concludes the proof of Theorem 2.1.